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Abstract 

The new generation of radio telescopes, such as the Square Kilometer Array (SKA), requires 
dramatic advances in computer hardware and software, in order to process the large amounts of 
produced data efficiently. In this document, we explore a new approach to wide-held imaging. 
By generalizing the image reconstruction, which is performed by an inverse Fourier transform, 
to arbitrary transformations, we gain enormous new possibilities. In particular, we outline an 
approach that might allow to obtain a sky image of size P x Q in (optimal) 0{PQ) time. This 
could be a step in the direction of real-time, wide-held sky imaging for future telescopes. 
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1 Image synthesis via Fourier transformation 

Traditionally, a radio astronomical image or sky intensity distribution depending on 

angular coordinates f,m G [—1,1], is derived from a (non-uniformly) sampled visibility function 
V^'^\u,v,w). For a specihc frequency v (or a small band around v), and making a set of common 
assumptions, the visibility V^‘'\u,v,w) is connected to by: 

OO OO 

V‘'''\u,v,w) = J j 4^^^(Tm) A(^)(Tm) (1.1) 

— OO —OO 

where {1,'m) j\/\ — P' — vrfi denotes a modihed intensity, n = — — iv? — 1, 

and is the normalized antenna reception pattern or primary beam [5,6]. 

Each sample V^'^\uij{f),Vij{t),Wij{t)) in the (n,u,re)-space is derived by measuring the cross 
correlation of antenna pair signals {i,j), i.e. {E^'^\ri,t),E^'^'>*{rj,f)), where (•) means time-averaging, 
E^‘'\ri,f) is the electric held at position fi, and * denotes conjugate transposition.^ Evidently, for 
each measurement, we obtain two samples, as 


V^^\uij{t),Vij{t),Wij{t)) = V^^^*{-Uij{t), -Vij{t), -Wijit)). 

Consequently, given Y antennas, 2 • samples are taken at any moment in time. The terms 
{uij{t),Vij{t),Wij{t)), the so called baselines, corresponds to fi — fj measured in wavelength and 
are expressed in the (tt, u, tc)-coordinate system [5,6]. The time-dependence arrives from tracking 
a source and thereby rotating the {u, v, rc)-coordinate system. 

Several strategies justify the reduction of (1.1) to a two-dimensional Fourier transformation: 
only a small region in the sky is observed and consequently n ~ 0, all antennas are almost in a plane 
and consequently rc Ri 0, or the term is treated as a convolution of V^''\u,v,w = 0) [2,6]. 

^(1) In this document, the word antenna stands for a single antenna or a beamformed station of antennas. (2) For 
simplicity, we consider the electric fields as scalar fields. An extension vector fields is possible. 
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If the beams {£, m) are different for each antenna pair, the convolution theorem is also used to 
correct for the A-terms, before a Fourier transform is performed [1]. 

The so called dirty image is obtained by a two-dimensional Fourier transform of the (non- 
uniformly) sampled (and corrected) visibility function. Further processing is needed to get a clean 
image, which is an approximation to the true sky intensity distribution as sampling 

implies that the dirty image is a convolution of the true sky intensity with the dirty beam or 
point spread function (PSF) of the instrument, the process to obtain a clean image is also called 
deconvolution. 


2 Image synthesis via an arbitrary transformation 


2.1 The idea 

For each antenna pair (baseline), after discretizing space i = (£ 1 , £ 2 , • • •, f’p), rh = (mi, m 2 ,..., mq) 
(for now, equidistant^) and time with tq = (tij,i, tjj, 2 , • • • ,tij,Kij) (possibly baseline-dependent) in 
(1.1), we approximate the measured visibility at time tq^k by 


^ {^ij Vij (tij^kf, Wij (tij k)) 


N M 


^ ^ Am mq) mq) . (2.1) 

p=l q=l 


In (2.1), we dropped any assumption on the antenna pair beams now, they can be different 
for each antenna pair and time-dependent.^ 

The key insight to express the intensity function in a convenient basis {Tq}, with dual basis 
{Tq}, is to consider linear combinations of visibilities; 


Y r Kij 

^('kfj) — ^ ^ ^ ^ ^ id^ijifij^k) iVij{tij^k) 1 Wij(tij^kf) — 

i=l j=l k=l 


P Q Y Y Kij 


Am ll^l^jip, mq) ^ ^ ^ ® ‘^^AuiA^^irk)tp+Vij(tij^k)mq+Wij(tij^k)npg) _ 


p=l q=l 


i=l j=l k=l 


new generating vector {£, rn) 


( 2 . 2 ) 


This leads to a new generalized visibilities, {c^'^^(Tn)}, which are linear combinations of the regular 
visibilities (see Fig. 2.2); the generalized visibilities correspond to coefficients in the new generating 
set {Tq}, where fl is a parameter set (such as a discretized (n, u)-space or a set of indices). We get 

^We assume for simplicity that the samples fill the entire sky. In a more general scenario, they only cover the 
region of interest. The discretization scheme warrants a more detailed discussion. 

^We assume that the time-dependent effects are known a priori. These could be beam shape modulations or 
re-pointing of antennas. 
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(a) dim(5) < PQ 


(b) dim(V) <C dim(5) 


Figure 2.1: If dim(5) < PQ, we can recover the projection onto approximation space S, Is{i,rn). 
If we have domain-specific knowledge about coefficients that are close to zero, we obtain a good 
approximation of the image by restricting ourselves to subspace V C 5 with dim(V) ^ dim(5). 

the following relation:^ 


P Q 

p=l q=l 

P Q 

pa ^ ^ A^ Am I^Zld^lp, mq)^^{ip, rUg). (2.3) 

p=l q=l 

Eq. (2.3) defines the transformation between coefficients C'^^)('kf 2 ) = {c^'^)(^q)} and the modified 
intensity 

C“')(>K„) = . (2.4) 

In order to invert transformation 7~ efficiently, say in 0{PQ) time, we select a suitable basis 
Often, the basis is unitary, or real and orthogonal. To obtain meaningful images, “enough” dual 
basis vectors 'I'q must be well-approximated, i.e. 

II^Q-^oll<fe or 

for some norm || • || and (0-dependent) thresholds or 5q. As the {'I'o} basis spans the PQ- 
dimensional image space, we obtain a projection onto reduced approximation space S, if only 
dim(5) < PQ dual basis vectors satisfy (2.5); such a situation is visualized in Fig. 1(a). The 
coefficients for not well-represented basis vectors are set to zero. 

■^We do not discuss the approximation error in detail. However, we provide a way to bound the approximation 
error in Appendix A. 


3 



























Figure 2.2: The new basis is designed such that a fast transformation is possible (so to speak “on a 
uniform grid”). Each coefficient is a linear combination of visibilities, given as non-uniform samples 
in the (u, v, u;)-space. The black boxes indicate basis vectors that are not well approximated and 
whose coefficients cannot be obtained reliably. The gray boxes are coefficients that are known to 
be zeros due to application-knowledge. For an optimal basis, there would be very few black, but 
many gray boxes, i.e. dim(5) « PQ and dim(V) ^ dim(5). 


If we have chosen a basis, which is well-represented and is suitable for astronomical images, 
we are hopefully able to exploit further domain-specific knowledge to anticipate coefficients that 
are zero or close to zero (see gray boxes in Fig. 2.2). In such a case, the computation is greatly 
reduced and a good low-rank approximation Iy{£,m) in V C 5, with dim(V) <C dim(5), of the sky 
brightness achieved. Such a scenario is visualized in Fig. 1(b). 

2.2 Advantages and disadvantages of the new approach 

In this section, we anticipate several advantages and disadvantages of the approach. Among the 
advantages are: 

• The above generalizes the approach of using a non-uniform Fourier basis, which is still con¬ 
tained by selecting '!'()' = {uij{tij^k),Vij{tij^k)) = i.e. non-uniform 

points in the continuous (u, u)-plane. In this case, only one weight factor is equal to one, while 
the others are zero, and it is apparent from (2.2) that we either have to assume Upq = 0 or 
Wij{tij^k) = 0 (or deal with it using IF-projection [2]). Furthermore, we have to assume 

= 1 or correct for the A-terms using A-projection [1]. 

• By designing a suitable basis so to speak on “a uniform grid”, an image is obtained efficiently 
by a simple inverse transformation. There is no need for gridding the non-uniform samples 
in Fourier space to a equidistant grid in order to perform an FFT, and there is no need to 
correct for the W-terms and A-terms by convolutions. In particular, discrete wavelets bases 
are promising candidates. In this case, the inverse fast wavelet transformation (FWT) is 
performed in 0{PQ) time. Furthermore, the wavelet basis can be tailored to the demands of 
radio astronomical images and the observing instrument. 

However, before the approach becomes practical, several obstacles have to be addressed: 
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• The determination of weight factors might be computationally intractable. Only in 

special cases, we might be able to determine the weights efficiently. In any case, we believe 
that the computation of the weights should be performed in a “offline” stage, instead of 
“online” during an observation. 

• Given certain baselines, which of the desired bases can be well approximated, i.e., have 
dim(5) ~ PQ? Can we use beamforming to obtain the antenna beams that are suitable? 

• Once we reconstructed an image using T“^, all the later processing stages need to be reex¬ 
amined. In particular, for a Fourier basis, we know that the dirty image is the sky intensity 
convolved with the PSF of the instrument. Similarly, in the new approach, the weights define 
a PSF that must be approximated and used for later deconvolution. This computational step 
needs still to be investigated within the new approach. 

2.3 How to approximate a basis 

A challenging question is the determination of weights such that (2.5) is satisfied for the 

basis vectors Tq that are most relevant to an image. Defining m) as the complex exponential 

g-2m(uij{tij^k)ip+'Vijitij,k)mq+wij{tij^k)npq)^ fQj. g^ch dual basis vector we need to determine 

arg min {£, m) - V aij^k Fi) ° m) || (2.6) 

{o’i.J.fcl . . , '^—1 _„ _✓ 

i,j,k '' 

“measurement vectors” 

for some norm || • ||, with element-wise Hadamard product o, and 'I'q, F^jl of size P x Q. For 
both efficiency and accuracy, the sum should generally he restricted to a (small) subset of possible 
values. For later reference, we denote the terms ° Fij \ as measurement vectors; therefore, we 
search for linear combinations of measurement vectors that represent the dual basis vectors well. 

For instance, by choosing || • \\f, (2.6) becomes a linear least squares problem (LLP). However, 
other requirements such as sparsity of the solution might be important, or even analytic solutions 
possible for a certain choice of {'1'^}. The general approach for obtaining the weight factors is 
outlined in Algorithm 1. The inputs are a desired basis {'I'o}, a set of baselines via {F^jj.}, 
corresponding antenna beams and tolerances {(5 q}. 


Algorithm 1 FmDWEiGiiTs{{^n},{Fljl},{A\’".^ffi,{Sn}) 

for 7 G F C D do 

For dual basis vector solve (2.6) to obtain a set of weights s.y = {aij^k}- 
if satisfy (2.5) for S.y then 

Store the weights s.y for determination of coefficient 
else 

Mark as unfeasible by setting = 0. 

end if 
end for 
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2.4 Compute costs 

Ideally the entire imaging procedure requires only 0{PQ) [at most, 0{PQ ■ log 2 PQ))] operations. 
This includes the creation of a dirty image, as well as the corrections for system noise and atmo¬ 
spheric effects. In this document, we only look a the first step. As there exist inverse transformations 
T~^ that only cost 0{PQ) [or 0{PQ • log2P(5)] operations, we must be able to limit the compu¬ 
tation of the coefficients to 0{PQ) [or 0{PQ ■ log 2 PQ)] operations as well. Assuming the weight 
factors are known a priori, each of the generalized visibilities can only be a linear combination of 
0(1) [or 0(log2 PQ))] of visibilities. We give motivation below, why we believe that, in certain 
situations, this might be possible.^ Once we have created a first image, the transformation between 
0-space and image domain is performed in 0{PQ) [or 0{PQ • log 2 PQ)] time. Any algorithm for 
corrections for system noise and atmospheric effects that performs this transformation a limited 
number of times, with 0{PQ) [or 0{PQ • log 2 PQ)] work in-between, would achieve the desired 
result. 

3 The simplified one-dimensional case 

3.1 The new setting 

From here on, we concentrate on the easier one-dimensional case, as we believe that the simplest 
case should be understood thoroughly first, before making any (possibly incorrect) conclusions that 
the approach is not feasible in practice or too computationally expensive. 

In the new setting, the brightness distribution I{i) only depends on one angular coordinate I 
and, after discretization, baselines are given by {(ujj(Qj^fc),rcjj(Qj^fc))}.® To further simplify the 
problem, we assume that the phase reference is at zenith, and the antennas are in a plane such that 
Wij{tij^jf) = 0 for all baselines and all time. The baseline for antenna i and j is now described by 
only its u-values during time, {uij(Qj,fc)}. If the antenna responses Aij^kiQ are equal for all antenna 
pairs or time-independent, it does not matter if the samples are taken by one baseline at different 
times or many baselines at one moment in time. The sampling in u-space is simply given by a set 
of values {u;,} = {u-b, ■ ■ ■, U- 2 ,U-i,ui,U 2 ,..., ub}, with Uh = —u^h, and where b is considered to 
be an index of the baseline or time. The corresponding antenna responses are denoted Ah(£). 

With all these simplifications, (2.1) becomes 

p 

V{ub) = Y, A£/(4) a,(4) . (3.1) 

p=i 

For constant Ah{£) = A(i), (3.1) describes the connection between visibilities F({uf,}) and observed 
intensity !{£) o A{t) as a non-uniform discrete Fourier transformation (NDFT). Forming linear 
combinations of the visibilities results in 

B P B p 

c(^n)= aW(u6) = ;^A£J(4) afcAfc(Q) ^ ^ j(^^) ^* (^^) ^ (3 2) 

b=—B p=l b= — B p=l 

b^O b^O 

^ ■ V-^ 

new generating vector bn {(■) 

®We simply assume that it is possible. It has still to be shown that this assumption is valid. 

®Here and in the following, we drop all the dependence on frequency v. Furthermore, the intensity I{t) corresponds 
to the modified intensity, i.e. the true intensity is — P. 
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where we assume that {'4’n} is a unitary basis. (This assumption is made throughout the rest of 
the document.) Eq. (3.2) corresponds to (2.2) and defines transformation T between generalized 
visibilities and intensity distribution as in (2.4): C{iPq) = T{I). 


3.2 Determining the weight factors 

To determine weight factors {cr;,}, we need to minimize the norm of the residual ~ 00; 

i.e. solve 

argmin||0Q(f) - '^abAb{i) o =. argmin||0O - s*Mn\\ , (3.3) 

{(^b} ^ " -V-' O- 

^ “measurement vectors” 

where Mq G 2 < L < 2B, with a subset of measurement vectors as rows, and s G 

containing the weights. Depending on the ratio of the number of samples in the sky, P, and the 
number of samples in n-space, L, the system can be over-determined or under-determined. As B 
grows as fast as (^), for Y antennas, both situations are possible. 

In this document we do not discuss the efficient solution of (3.3). However, even the best case of 
orthogonal T, the error in the dirty image is of the same magnitude as the error in the coefficients, 
which are formed as weighted visibilities. Thus, the error is not only determined by the uncertainty 
in the measured visibilities, but also but also by the error in the computed weight factors. In order 
to control it, we need to control the error in the weights. 

Let So be the solution to (3.3) and K 2 {M^) be the condition number of Mq (both for the || • 10- 
norm). If s denotes the computed weights and, as we assume that only results with “small” residual 
norm are accepted, we conjecture that 


|So - gib 

II II2 


K2{Mq) 


f W^nh , ||AM^\ 

V Unh ^ IlMdb J 


< K2{MQ)6ft . 


(3.4) 


In this case, the error is dominated by the uncertainty in Tq, which, when of the order of do, 
contains a solution with zero residual. The details need to be worked out, but Eq. (3.4) suggests 
that the rows of Mq (the measurements) need to be selected not only such that their span is close 
to 0Q, but also in such way that the condition number K 2 {Mq) does not become too large.^ 


4 Numerical examples 

4.1 The setting 

In this section, we consider the simplified case described on Section 3. In order to specify the visi¬ 
bilities analytically, we define intensity distributions I{i) that have an analytic Eourier transform. 
In particular, we consider the sum of Gaussians; 

l(^) = with Ck = ^ , (4.1) 

k 

and 

- 2 2 

Tf TT u 

e (4.2) 

Ck 

^Furthermore, the measurement vectors might be selected by the quality of the measured visibility, i.e. its variance. 
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For a smooth distribution we choose { 01 , 02 } = {60,100}, {iTi,cj 2 } = {0.04,0.02}, and 

{Pi)P 2 } = {—0.4,0.2}. For a more point source like situation we choose { 01 , 02 , 03 } = 

{70,100, 60}, { 01 , 02 , 03 } = {0.005,0.01,0.005}, and {j 3 i,P 2 ,Ps} = {-0.4, -0.05, 0.3}. 

We quantify the error in the computed brightness distribution lMethod{(-) by 

jp _ — lMethod{^)\\2 

However, a smaller error does not necessarily mean a better result. The dirty image is the true 
intensity convolved with a PSF. A clean image that should recover the true intensity is obtained 
by subsequent deconvolution. 

4.2 The principle and low-rank approximations 

In this section, we show how a low-rank approximation to the image is constructed in an arbitrary 
basis. We choose a basis that is convenient to demonstrate the idea, but not suitable for general 
use. In particular, we use the Hadamard-Walsh basis [3], as (1) a fast transformation requiring 
only 0{P • log 2 P) operations exists, and ( 2 ) it is orthogonal.® 

We define a basis for our approximation subspace S = span{V'A}) by choosing a subset of 
the Hadamard-Walsh basis {V'n}; A C fl. We discretize the sky with P = 1,024 points and 
choose a normalized version of every 16th row of the Hadamard matrix Hp. This is equivalent to 
using the rows of Hp as our basis {ipn} for the entire image space and setting all coefficients for 
n ^ A = {0,16,32,...} to zero.® 

In a more practical setting, we would construct the basis of S based on the measurement vectors, 
which are given for specific antenna positions and responses, such that (3.3) has a small residual. 
However, to illustrate the idea, we further simplify the situation depicted in Section 3 by assuming 
Aff(£) = 1.^® In this case, for the || • || 2 -norm, the weight factors {cfe} are determined by an adjoint 
NDFT. By assuming the u-values are on the grid, which is defined by the discretization of £, without 
loss of generality, we can use an FFT to compute the weight factors. Furthermore, for each basis 
vector, we assume that we measured the visibilities for the u-values that correspond to the r/ G N 
largest coefficients of the discrete Fourier spectrum. In fact, we only use these rj coefficients, which 
corresponds to restricting the sum in (3.3) to 2r] terms. We remark that all those assumptions and 
simplifications are made to demonstrate the idea for the specific choice of basis and do not have to 
be made in general. 

In this simplified setting, each ■i/’A is approximated by a finite Fourier sum as depicted in 
Fig. 4.1 for As the residual norm, 11^^128 — '0i28||) is 0.165 and 0.092 for r/ = 5 and r/ = 15, 

respectively, the basis vector is actually not very well approximated. Despite the relatively bad 
approximation, especially for some of the other basis elements, we obtain reasonable results for the 
Hadamard-Walsh coefficients and the reconstructed images, which are shown in Fig. 4.2. 

The overall compute cost are as follows: the computation of the Hadamard-Walsh coefficients 
requires dim(5) • lOr/ = ^r]P real floating point operations (flops). Thus, the image reconstruction, 
including the inverse transform, requires only 0{P • log 2 P) flops. 

®That is, 'bn = 'bn = • 

®We used scipy.linalg.hadamard(P) to construct Hp and thus start row indexing with zero. 

^®As we discuss below, as each coefficient is a linear combination of only a few visibilities, we are not able to capture 
finer structures very well. 

^^In fact, we convolved ^>128 with ({, {, {, }) before finding the Fourier coefficients. 
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Figure 4.1: Approximation of ' 0 i 28 by V’i 28 - In 







.... 


. 































- 


. 


.... 


.. . 



-0.04 i-'-^^-1 

-1.0 -0.5 0.0 0.5 1.0 

Direction cosine I 


(b) 7? = 15 

case, V ’128 is a sum of r] Fourier components. 




(a) 77 = 5 


(b) 77 = 15 


Figure 4.2: The true intensity is given by For 77 = 5 and rj = 15, the error of 0.36 

and 0.25, respectively, and must be compared to the approximation error Es = 0.24 for the perfect 
image in S. For a better choice of a basis and S, the images would look visually more 

appealing. 


To summarize the low-rank approximation procedure: for a given basis {'ijjn}, we find the 
weights in (3.3) and corresponding residuals. All the vectors that are well-approximated (residuals 
smaller than a certain tolerance), define our approximation subspace S. The coefficients are found 
by forming linear combinations of the measured visibilities, before a final inverse transform gives 
the image approximation. The choice of basis is crucial, as dim(5) ^ P might be problematic if S 
does not contain a good approximation to the true image (i.e., the coefficients that are not obtained 
are not close to zero). If we have further knowledge about the coefficients, we are able to reduce 
the computations even further by only considering a subspace V C 5. 
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(a) Large scale structures. 


(b) Medium scale structures. 


Figure 4.3: Naturally, antennas that are sensitive in all directions, = 1, are good to provide 

information of large-scale structures of the sky, while more localized information is obtained from 
antennas that are mostly sensitive to the direction of interest, (a) If Wi/jq — '^nll/IIV’oll = 0.035 
is considered sufficiently accurate, only 15 visibilities are combined to obtain the corresponding 
wavelet coefficient, (b) For finer structures, to obtain a similar approximation accuracy (HV'o — 
'^nll/IIV’oll = 0.048), already 10 times as many visibilities must be considered. For even finer details, 
this becomes inefficient. 


4.3 Wavelet transforms 

Wavelets promise to provide suitable bases.They have many desirable features: (1) fast trans¬ 
formations requiring only 0{P) time exists; (2) they can be tailored to the specific domain-, (3) 
they can be “smooth” and are localized; (4) they have an underlying continuous transform and 
come with a vast amount of theory; and, lastly, (5) since a basis is constructed by a scaling and 
translating a mother wavelet, it is possible to reduce the amount of computations needed to deter¬ 
mine the weight factors (e.g., across different frequency bands and for translations). As wavelets 
have clear advantages over Fourier basis to represent sharp peaks, radio astronomical images in 
a suitable wavelet domain should be relatively sparse. All those facts, make them an interesting 
family of bases to study. 

4.3.1 Why the A-terms (and rc-terms) matter 

In the previous section, we assumed Afe(£) = 1, i.e. antennas that observe the entire sky from all 
directions equally well. However, such (idealized) antennas are problematic, when the chosen basis 
{f’n} consists of vectors with rather compact support. The problem is best illustrated when assum¬ 
ing w = 0. For Ab{i) = 1 and a specific -ifn, weight factors correspond to Fourier components of 
However, if -i/jq has very compact support, many Fourier components must be considered to get a 
good approximation. Consequently, many visibilities must be combined to get an approximation to 
the wavelet coefficient. This would be inefficient. The phenomenon is illustrated in Fig. 4.3 for two 
basis vectors of the syni6 symlets and P = 1,024. It becomes inefficient to obtain the coefficients 

^^Throughout this section, we used the Py Wavelets python package [7]. 
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for wavelets that capture the more detailed structures, Fig. 3(b), as many visibilities need to be 
combined.However, for large scale structures. Fig. 3(a), only a few visibilities must be combined 
to compute the wavelet coefficient, which is very efficient. 

If we have the antenna beams that cannot be approximated as being equal to one, or 

non-zero rc-terms, we need to find a way to approximate basis elements V’n by a finite sum of 
Aii{£) o g- 2 vri(tii,£-i-«)i,n) only a few terms. We want to provide some intuition, why we believe 
this can be possible, and why certain wavelets might provide a good basis. 

Both the H-term, for now assumed to have compact support^^, and the rc-terms correspond 
to a “smearing” in the Fourier domain. That is, we do not measure a single Fourier component 
anymore, but all Fourier components in a frequency interval with a certain weighting. (The AW- 
projections reflect this by applying convolutions to the visibilities.) But such compactly supported 
Fourier spectra actually constitute wavelets. As a simple example, if has Gaussian shape, 
g-a(£-fo) ^ Q g g [—1,1], and re = 0, a single antenna pair measures a Gabor wavelet 

coefficient. However, Gabor wavelets are non-orthogonal and the question of completeness of the 
representation must be addressed [4].^^ Furthermore, we have the problem of having our data 
points not on a “uniform grid” and some form of gridding would be necessary. 

Other wavelets are approximated by suitable design of antenna beams, Ai){l)}^ Similarly, when 
Afe(£) and baselines are given, we might construct a suitable wavelet such that it can be represented 
by a sum of a few measurement vectors. In any case, the sum in (3.3) is severely restricted by 
leaving out baselines for which Ai,{£) is close to zero in the region of the wavelet support; that is, 
we only take antennas into account that are sensitive to the region of the sky that is important to 
determine the wavelet coefficient. 

Purely based on this intuition, we believe that specific wavelets that do not capture details 
below the capabilities of the instrument, can be represented by a limited sum of measurement 
vectors; thereby allowing the efficient computation of the corresponding wavelet coefficient. We are 
currently working on a more detailed analysis. 

4.3.2 Reconstructing images with wavelets transforms 

Although we discussed the important role of the A-terms (and rc-terms), we once again assume 
Af,{i) = 1 and re = 0. This is solely done, as in this case the visibilities are given analytically. In 
contrast, for a more realistic scenario, the visibilities would have to be determined by a simulation. 
As described above, our assumptions imply that the wavelet coefficients for the detailed structures 
are linear combinations of many visibilities. This is inefficient, but in this section, we only want 
to show that it works. Furthermore, we use a rather dense sampling of the u-space, by setting 
y = 35 antennas at positions \k'^, /c = 0,1,..., T — 1 (using normalized wavelength A = 1). The 
dense sampling implies that all elements of basis are well approximated.^'^ (We used tolerance 
= 10 -^) 

^^However, it is efficient to compute the weight factors, as all the translates in a wavelet basis are simple phase 
shifts for the weight factors. 

^■^Else, it might be considered a weighted sum Ai,{£) = l3jAj{£ — Ij) of normalized antenna responses Aj with 
compact support. 

^®We are not aware of studies applying windowed Fourier transforms to radio astronomical imaging. 

^®This can be done via beamforming. 

^^The sampling in ti-space, and a general scenario the {u, v, rcf-coverage, is absolutely crucial to the success of the 
approach, as it is for the classical Fourier-based reconstruction. 
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(a) P = 256 


(b) P = 512 


Figure 4.4: The true intensity is given by The results are normalized such that ||/i(£)||i = 

II-^jvoftWIIi = \\IiDWT{^)h- In this case, = 0.285, = 0.263 {P = 256), and E^^^^ = 

1.2 • 10-12 (p ^ 512). 




(a) P = 256 


(b) P = 512 


Figure 4.5: The true intensity is given by l 2 {^)- The results are normalized such that ||/ 2 (^)||i = 
\\lNDFTi^)h = \\IiDWTi^)h- In this case, = 0-378, E^^^^ = 0.293 {P = 256), and E^^^^ = 

1.2 • 10-3 (P = 512). 


We also mentioned the importance to restrict the sum in (3.3) and that, with the above as¬ 
sumptions, weights for a translated basis element are determined by a simple phase shift. We did 
not make any use of this and solved (3.3) in a brute-force way instead. To keep the computational 
cost within limits, we restricted P to 256 and 512. 

For the two intensity distributions Ii{i) and hi^), the results are presented in Figs. 4.4 and 
4.5, respectively. With a single wavelet transform, we are able to reconstruct the intensity quite 
well. However, in this section with its simplified assumptions, we only demonstrate a proof of 
concept: images are created by an inverse discrete wavelet transform! The details, optimization. 
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Table 5.1: Traditional vs. generalized approach. 



Traditional 

Generalized 

Basis 

^ — m) 


Dirty image 

Inverse Fourier transform of 
sampled visibilities V{u,v): 
T-\V{u,v)) 

Performed by gridding -|- FFT. 

Any inverse transform, 
depending on basis {vl'n}: 

T-\C{^n)) 

(e.g., FWT) 

TTF-terms 

Corrections 
via convolutions 

Included in the 
construction of the basis 

Clean image 

Deconvolution with PSF 

Deconvolution with PSF 
defined by weights 


and practical issues must still be investigated. 

5 Open questions and future research 

Before concluding, we give a short comparison of the traditional and the generalized approach in 
Table 5.1. The generalized procedure is far from being fully developed. Before it can be used in 
practice, a number of questions need to be addressed. At this point, we collect a few of them: 

• For a given antenna configuration and the radio astronomical application domain, which basis 
is “optimal”? Or, can we co-design the placements and beams of antennas and the image 
basis? Is there a design that allows for sparse weighting vectors? 

• How to determine weights for a specific choice of basis and antenna configuration efficiently? 

• For which use cases could the approach be beneficial? (E.g., low-resolution all-sky imaging.) 

• How do errors (discretization, approximation, algorithmic) influence the final result? 

In the future, we plan to address these (and more) questions in order to work out the approach 
in detail for a realistic scenario. In Appendix A, we describe the procedure in matrix form, which 
is more suitable for a detailed analysis. 

6 Conclusions 

We generalize the procedure of obtaining a radio astronomical sky image by an non-uniform discrete 
Fourier transformation. As a result, in our generalized approach, many transformations are possible, 
giving raise to whole new space to explore. In particular, we demonstrate how an image is obtained 
by a fast discrete wavelet transform. As such transformations can be tailored to the application 
domain and performed in linear time, they promise to be one of the ingredients of real-time, wide- 
held imaging algorithms. 
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A Matrix notation 

In this section, we investigate the one-dimensional case, which was described in Section 3, in a 
different notation. Let i = /(£) G be the intensity (or image), v = R({ub}) G the set of 
measured visibilities. Eq. (3.1) then becomes 

V = Ml, (A.l) 

where M G is the measurement matrix, with and p = 1, 2,..., P as its 

rows. With the pseudoinverse M\ we have a solution i = (or select any of the solutions 

— (/ — M^M)q for all q G C^). The only problem is that solving (A.l) is generally expensive. 
Ideally we would have the following situation: let T* G be any unitary matrix, often real 

and orthogonal, and the system 

g = T*i, (A.2) 

solvable in 0{P) or 0{P log 2 P) time. If we could estimate g G in comparable time, the imaging 
problem could be solved very efficiently. Similarly, if 

c = f*i (A.3) 

for known c G and T* Ri T*, we can solve c = T*i to obtain an approximation to the image. 

Let Wj G j = 1, 2,..., P, be a vector of unknown weights. By forming 

w*v = w*Mi, (A.4) 

and comparing to (A.3), we find that if w*M approximates well the j-th row of T*, i/;*, Cj = w*v is 
a good approximation to gj = This statement is quantified by the Cauchy-Schwartz inequality, 
as 

\9j - Cj\ = KV'j - w*M)i \ < IIV’* - u;*M|| 2 ||r ||2 • (A.5) 

If we require a certain accuracy in the coefficient approximation, (A.5) gives an idea on how big 
— WjM \\2 is tolerable.^® In general, we require 

\\rj-w*M\\<5, (A.6) 

to consider Cj and accurate approximation to gj. If (A. 6 ) does not holds, we set Cj to zero. 

(A.5), the approximation error can be large in parts of the shy where there are no strong sources. 
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Without any loss of generality, assume (A.6) holds for j = 1, 2,..., 5 and not for j = 5+1, ..., P. 
Let W* G £,Sx 2 B rows w* and partition (A.3) as follows 


Cl 

0 



(A.7) 


where ci G C^, G f| G In (A.7), T| G is the part of T* that is 

formed by all the ip* that do not full-fill (A. 6 ). 

Our approximate image z in 5 = span{i/^j : j = 1, 2,..., 5} is obtained by solving 


Cl 

0 



(A. 8 ) 


in 0{P) or 0 (Plog 2 P) time, i.e. i = Tici. We introduce an error by replacing Ti by Ti. However, 


||(ri-fi)ci||2 


< 


h2 


WTpih 


Cl 2 


< \\T^ - W*M\\2 . 


(A.9) 


But, ||Tj^ — W*M ||2 is easily bounded as (A. 6 ) holds for every row. 

There are a number of questions that need to be addressed: (1) The existence of W with 
5 ~ P; (2) the sparsity of W such that forming ci = W*v is performed in 0{P) or 0{P ■ log 2 P) 
operations. Both questions are highly coupled with the specific form of measurement matrix M and 
transformation T. In this paper, we do not address these questions. We simply state the problem 
again; Does there exists a unitary T G and a sparse matrix W* G , where g = T*i is 

solvable in 0{P ■ log 2 P) time, and for any subset of S rows of T* (ideally, S ps P), T( G P^'^P , 
not only \\Tf — IT*M|| < 6, for some toleranee 6, but also W*ci is performed in 0{P•log 2 P) time? 

Also, why can the sketched approach be fast, if the determination of W is potentially expensive? 
The answer is that W is determined independent of the image that is created. That is, the weights 
are determined before an observation starts (“offline”). The idea is as follows; before an observation, 
we know how M will involve during time. More rows are added as measurements are performed. 
For each moment of the observation, we determine a matrix W such that ci = W*v, where values 
of V and rows W are added for the measurements. If we have domain-specific knowledge that 
some coefficients of ci are close to zero these coefficients are not computed, also the weights W 
(the associated columns) do not need to be determined. In this way, when starting an observing, 
we obtain a better and better image as time evolves. In a steady state, depending on how much 
measurement from the past are included in the image creation (old measurements are removed 
from M and v), the observation is showing the sky integrated during a certain time window. 
During the observation (“online”), as W is known and assumed to be sparse, the image creation is 
computationally efficient. 
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